First, open a small sample
setwd("~/Desktop/taxis/")
df <- read.csv("data/sample.csv", nrows = 200)
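Open up the shape files, to get the NYC neighborhoods in the data frame (from the latitude/longitude pairs)
library(dplyr)
library(RCurl)     # getURL
library(RJSONIO)   # fromJSON
library(lubridate) # date
library(ggplot2)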
Functions for getting neighborhood from Google Maps API

Here’s a function to build a URL to call the Google Maps API

url <- function(latitude, longitude, return.call = "json", sensor = "false", result_type = "neighborhood", api_key = "") {
  root <- "https://maps.google.com/maps/api/geocode/"
  u <- paste(root, return.call, "?latlng=", latitude, ',', longitude,'&result_type=', result_type, '&key=', api_key, sep = "")
  return(URLencode(u))
}
my_key <- Sys.getenv("GOOGLE_MAPS_API_KEY")  # assumed environment variable name; keeps the key out of the notebook
geoCode <- function(latitude, longitude, verbose = FALSE) {
  if (verbose) cat(latitude, longitude, "\n")
  u <- url(latitude = latitude, longitude = longitude, api_key = my_key)
  doc <- getURL(u)
  x <- fromJSON(doc, simplify = FALSE)
  if (x$status == "OK") {
    # the first address component of the first result holds the neighborhood name
    neighborhood <- x$results[[1]]$address_components[[1]]$long_name
    Sys.sleep(0.5)  # pause between successful requests
    return(neighborhood)
  } else {
    # rebuild the URL without the key so the key is not printed
    print(paste("ERROR: status:", x$status, "url:", url(latitude = latitude, longitude = longitude, api_key = ""), sep = " "))
    return(NA)
  }
}
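
geoCode takes the first address component of the first result and assumes it is the neighborhood. A slightly more defensive lookup would search the components for one tagged with the wanted type. The sketch below shows that idea on a hand-written stand-in for a parsed response (not real API output); the helper name `extract_component` is hypothetical, and the list layout is assumed to match what `fromJSON(doc, simplify = FALSE)` returns.

```r
# Hand-written stand-in for a parsed geocoding response (not real API output)
mock_response <- list(
  status = "OK",
  results = list(list(
    address_components = list(
      list(long_name = "Williamsburg", types = list("neighborhood", "political")),
      list(long_name = "Brooklyn", types = list("political", "sublocality")),
      list(long_name = "New York", types = list("locality", "political"))
    )
  ))
)

# Hypothetical helper: long_name of the first component tagged with `type`,
# or NA when the request failed or no component carries that tag
extract_component <- function(x, type = "neighborhood") {
  if (!identical(x$status, "OK") || length(x$results) == 0) return(NA_character_)
  for (comp in x$results[[1]]$address_components) {
    if (type %in% unlist(comp$types)) return(comp$long_name)
  }
  NA_character_
}

extract_component(mock_response)
extract_component(mock_response, type = "locality")
extract_component(list(status = "ZERO_RESULTS", results = list()))
```

Returning NA_character_ rather than a bare NA keeps the result type consistent when many calls are combined into one vector.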

Test output

geoCode(40.714224,-73.961452)

Now I’m going to need to add the neighborhood for each row in the data frame. Before I deal with that, let’s see if this vectorizes nicely so it can be used with dplyr.


vect.geoCode <- Vectorize(geoCode, vectorize.args = c("latitude", "longitude"))
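
To see what Vectorize does without touching the network, here is a small sketch with a toy scalar function standing in for geoCode. The function `label_point` and its threshold are made up for illustration.

```r
# Toy scalar function standing in for geoCode (no network calls).
# `if` only accepts a single value, so this fails on vectors as written.
label_point <- function(latitude, longitude) {
  if (latitude > 40.75) "north" else "south"
}

# Vectorize wraps the function in mapply, calling it once per latitude/longitude pair
vect.label <- Vectorize(label_point, vectorize.args = c("latitude", "longitude"))

labels <- vect.label(c(40.70, 40.80), c(-73.99, -73.95))
labels
```

Note that this is a loop under the hood: the wrapped function is still called once per element, so it does not reduce the number of API calls.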

Testing

vect.geoCode(df$dropoff_latitude[1:10], df$dropoff_longitude[1:10])

Now let’s see if this works with dplyr.

df %>% top_n(10) %>% mutate( neighborhood = vect.geoCode(pickup_latitude, pickup_longitude))

Looking at where this breaks: all the NA values in these 10 examples are at JFK airport. I’ll need to fill these in later.


Normalizing data

df.large <- read.csv("data/train.csv")
nrow(df.large)
[1] 1458644

Taking a look at the distribution of durations:

df.large %>% ggplot(aes(x = trip_duration)) + geom_histogram()

WAAAY too skewed. The duration is in seconds. 3,000,000 seconds? That’s an 833-hour taxi ride (3,000,000 / 3,600). No way.

For my visualization purposes, I’m interested in “regular” rides: rides that are comparable, so that the visualization is meaningful. Without thinking too much, I’ll limit rides to those at most two hours long.

df.large %>% filter(trip_duration < 7200) %>% ggplot(aes(x = trip_duration)) + geom_histogram()

# limit to trips of at most two hours in duration
df.large <- df.large %>% filter(trip_duration <= 7200) %>%
      mutate(trip_duration = trip_duration/60) # Change time units to minutes, not seconds

Now to see if the number of passengers makes sense:

df.large %>% group_by(passenger_count) %>% summarize(count = n())

I’ll only keep records with between 1 and 6 passengers.

df.large <- df.large %>% filter(between(passenger_count, 1, 6))
df.large %>% mutate(date = date(pickup_datetime)) %>% group_by(date) %>% 
      summarize(num.records = n()) %>%
      ungroup() %>%
      ggplot(aes(x = date, y = num.records)) + geom_point() + geom_line()

There seems to be one weird day, but I’m not too worried about that. The data seems to be pretty consistent over the days in the dataset.

Adding halfway latitude/longitudes

It might be interesting to see where the halfway points are for each ride. As a crude heuristic, I’ll take the vector average of the pickup and dropoff locations.

df.large <- df.large  %>% mutate(halfway_latitude = (pickup_latitude + dropoff_latitude)/2, 
                     halfway_longitude = (pickup_longitude + dropoff_longitude)/2)
Adding the neighborhoods

I’ve defined a vectorized function vect.geoCode for calling the Google Maps API and getting the neighborhood given a latitude and longitude.

With over 1,000,000 rows, however, I can’t call the API for every row. I’ll need to round the latitude/longitude coordinates, group based on these rounded coordinates, and only call the API once per group.
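
As a minimal sketch of the rounding idea, using made-up coordinates and a hypothetical helper `unique_pairs` (base R only), coarser rounding collapses nearby points into the same pair:

```r
# Made-up coordinates: the first two points are about 250 meters apart
coords <- data.frame(
  lat = c(40.71234, 40.71449, 40.75871),
  lon = c(-74.00561, -74.00612, -73.98512)
)

# Hypothetical helper: distinct latitude/longitude pairs after rounding
unique_pairs <- function(coords, digits) {
  unique(data.frame(
    latitude = round(coords$lat, digits),
    longitude = round(coords$lon, digits)
  ))
}

pairs_3 <- unique_pairs(coords, 3)  # all three points stay distinct
pairs_2 <- unique_pairs(coords, 2)  # the first two points share a pair
nrow(pairs_3)
nrow(pairs_2)
```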

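Rounding a latitude/longitude pair to the 3rd decimal place gives a grid cell of roughly 110 by 85 meters at New York’s latitude (about 0.009 square kilometers). New York City covers around 789 square kilometers. Therefore, if I round to the 3rd decimal place and group by lat-long pairs, there are around 84,000 possible cells in the city, though taxi pickups are concentrated in a small part of it, so I should expect far fewer unique rounded pairs…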
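
The cell-size arithmetic can be sketched as follows. The constants are approximations I am assuming here (about 111.32 kilometers per degree of latitude, and a latitude of 40.7 degrees for New York); a degree of longitude shrinks with the cosine of the latitude.

```r
# Approximate kilometers per degree at New York's latitude (assumed constants)
lat_deg_km <- 111.32
lon_deg_km <- 111.32 * cos(40.7 * pi / 180)

# Area in square kilometers of one grid cell when rounding to `digits` decimals
cell_area_km2 <- function(digits) {
  step <- 10^(-digits)
  (lat_deg_km * step) * (lon_deg_km * step)
}

nyc_area_km2 <- 789  # figure used in the text

# Rough count of grid cells needed to tile the city at each precision
cells_3 <- nyc_area_km2 / cell_area_km2(3)
cells_2 <- nyc_area_km2 / cell_area_km2(2)
round(c(three_decimals = cells_3, two_decimals = cells_2))
```

Each extra decimal place divides the cell area by 100, so it multiplies the number of possible cells by 100.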

df.large %>% mutate(latitude = round(pickup_latitude, 3), longitude = round(pickup_longitude, 3)) %>%
      group_by(latitude, longitude) %>%
      summarize(num.occurrences = n())

13,000 rows. Each row takes around 0.6 seconds… That would take around 2 hours (assuming Google doesn’t close the connection).

If I instead round to 2 decimal places (each lat-long cell should be around 1.1 kilometers across):

df.large %>% mutate(latitude = round(pickup_latitude, 2), longitude = round(pickup_longitude, 2)) %>%
      group_by(latitude, longitude) %>%
      summarize(num.occurrences = n())

I get only 1,000 rows.
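
For a rough comparison of the two options, here is the request-time arithmetic using the row counts and the per-request time quoted above (both are approximate figures from the text, not new measurements):

```r
seconds_per_call <- 0.6  # rough per-request time quoted in the text
n_pairs <- c(three_decimals = 13000, two_decimals = 1000)  # row counts quoted in the text

# Total time in minutes if every pair needs one API request
minutes <- n_pairs * seconds_per_call / 60
minutes
```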

I’ll use this less precise representation. Most likely, an error of at most 1.1 kilometers will not cause an error in the neighborhood classification. Also, what’s more interesting, really, are questions like: “Are people going downtown on Friday nights?” or “Do people take taxis home to Brooklyn from work?”

neighborhoods <- bind_rows(
  df.large %>% mutate(latitude = round(pickup_latitude, 2), longitude = round(pickup_longitude, 2)) %>%
        select(latitude, longitude),
  # the dropoff longitude must come from dropoff_longitude, not dropoff_latitude
  df.large %>% mutate(latitude = round(dropoff_latitude, 2), longitude = round(dropoff_longitude, 2)) %>%
        select(latitude, longitude)
) %>%
  distinct()  # keep each pair once, even if it is both a pickup and a dropoff location
# time execution
startime <- Sys.time()

# Add the neighborhood for each rounded latitude/longitude pair
neighborhoods <- neighborhoods %>%
      mutate(neighborhood = vect.geoCode(latitude, longitude))

endtime <- Sys.time()
endtime - startime  # elapsed time
neighborhoods

It worked! I’ll write this to disk to be safe…

write.csv(neighborhoods,"neighborhoods.csv")

Now I’ll try and see why so many API requests failed.

neighborhoods %>% filter(is.na(neighborhood)) %>%
      group_by(latitude,longitude) %>%
      summarize( count = n()) %>%
      ungroup() %>%
      arrange( desc(count))
nrow(neighborhoods[is.na(neighborhoods$neighborhood),])
[1] 650

There are many missing neighborhoods. This might not be a problem if they are more obscure pickup/dropoff locations that don’t occur frequently in the data set.

I’ll revisit this after I’ve joined the neighborhoods with the rest of the data, and see if I should try to fill these NAs. I’ll only do so if I can’t make a meaningful chord plot otherwise.

Joining the neighborhoods into the dataset
df.large <- df.large %>% 
      mutate(rounded_pickup_latitude = round(pickup_latitude, 2), 
             rounded_pickup_longitude = round(pickup_longitude,2)) %>%
      left_join(neighborhoods, by = c("rounded_pickup_latitude" = "latitude", "rounded_pickup_longitude" = "longitude")) %>%
      mutate( pickup_neighborhood = neighborhood) %>%
      select(-rounded_pickup_latitude, -rounded_pickup_longitude, -neighborhood) %>%
      mutate(rounded_dropoff_latitude = round(dropoff_latitude, 2), 
             rounded_dropoff_longitude = round(dropoff_longitude,2)) %>%
      left_join(neighborhoods, by = c("rounded_dropoff_latitude" = "latitude", "rounded_dropoff_longitude" = "longitude")) %>%
      mutate(dropoff_neighborhood = neighborhood) %>%
      select(-rounded_dropoff_latitude, -rounded_dropoff_longitude, -neighborhood)
mean(is.na(df.large$pickup_neighborhood))
[1] 0.05218883
mean(is.na(df.large$dropoff_neighborhood))
[1] 0.05525133
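
One thing worth checking after a lookup join like this is that the row count has not changed: if the lookup table holds the same latitude/longitude pair twice, every matching trip is duplicated. The sketch below shows the effect on toy data, using base R's merge with all.x = TRUE as a stand-in for left_join; the coordinates and neighborhood labels are made up.

```r
# Toy trips and a lookup table that contains one pair twice (made-up values)
trips <- data.frame(
  id = 1:3,
  latitude = c(40.71, 40.75, 40.71),
  longitude = c(-74.00, -73.98, -74.00)
)
lookup_dup <- data.frame(
  latitude = c(40.71, 40.71, 40.75),
  longitude = c(-74.00, -74.00, -73.98),
  neighborhood = c("Tribeca", "Tribeca", "Midtown")
)

# Left join against the table with a duplicate: trips 1 and 3 each match twice
joined_dup <- merge(trips, lookup_dup, by = c("latitude", "longitude"), all.x = TRUE)

# Left join against the de-duplicated table: one output row per trip
joined_ok <- merge(trips, unique(lookup_dup), by = c("latitude", "longitude"), all.x = TRUE)

nrow(joined_dup)
nrow(joined_ok)
```

Comparing nrow() before and after the join is a cheap way to catch this.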

No problem. Only about 5% of pickups and about 5.5% of dropoffs don’t have neighborhoods. I’ll take that.

One last look:

head(df.large)

I’ll get rid of the annoying X.x and X.y columns from joining, and then we look good. I’ll write this data and use it for the dashboard.

df.large <- df.large %>% select(-X.x, -X.y)
# Write cleaned data
write.csv(df.large, "data/clean.csv")